Deep Forest and Pruned Syntax Tree-Based Classification Method for Java Code Vulnerability

Abstract

The rapid development of J2EE (Java 2 Platform Enterprise Edition) has brought unprecedented challenges to vulnerability mining. The current abstract syntax tree-based source code classification method does not eliminate irrelevant nodes when processing the tree, resulting in long training times and overfitting. Another problem is that different structures are translated into the same sequence of tree nodes when trees are processed using depth-first traversal, so the algorithm loses semantic structure information, which reduces the accuracy of the model. To address these two problems, we propose a deep forest and pruned syntax tree-based classification method (PSTDF) for Java code vulnerability. First, breadth-first traversal obtains the statement trees; next, pruning removes irrelevant nodes; then a depth-first based encoder is used to obtain the vector; finally, deep forest is used as the classifier to get the results. Experiments on publicly accessible datasets show that PSTDF can effectively remove the impact of redundant information.
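The two ideas in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Node` class, the label sets `STATEMENT_LABELS` and `IRRELEVANT_LABELS`, and all function names are hypothetical, and the encoder and deep forest classifier are left out. The sketch shows that two differently shaped trees can share one depth-first label sequence, and how a breadth-first walk might collect statement subtrees before irrelevant nodes are pruned.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def dfs_labels(root: Node) -> List[str]:
    """Pre-order depth-first label sequence; the tree shape is discarded."""
    out = [root.label]
    for child in root.children:
        out.extend(dfs_labels(child))
    return out


# Assumed label sets, chosen only for this illustration.
STATEMENT_LABELS = {"If", "Return", "ExprStmt"}
IRRELEVANT_LABELS = {"Comment", "Modifier"}


def statement_trees(root: Node) -> List[Node]:
    """Breadth-first walk that collects subtrees rooted at statement nodes.

    Simplification: a nested statement is returned on its own and also
    remains inside its parent statement's subtree.
    """
    found, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node.label in STATEMENT_LABELS:
            found.append(node)
        queue.extend(node.children)
    return found


def prune(node: Node) -> Node:
    """Return a copy of the subtree without irrelevant child nodes."""
    kept = [prune(c) for c in node.children
            if c.label not in IRRELEVANT_LABELS]
    return Node(node.label, kept)


# Two different structures with the same depth-first label sequence.
chain = Node("A", [Node("B", [Node("C")])])   # A -> B -> C
fan = Node("A", [Node("B"), Node("C")])       # A -> B, A -> C
same_sequence = dfs_labels(chain) == dfs_labels(fan)

# A toy method body: collect statement trees, then prune each one.
method = Node("Method", [
    Node("Modifier"),
    Node("If", [Node("Cond"), Node("Comment"),
                Node("Return", [Node("Name")])]),
    Node("ExprStmt", [Node("Call")]),
])
pruned = [prune(t) for t in statement_trees(method)]
```

In this toy input `same_sequence` is true even though `chain` and `fan` have different shapes, which is the ambiguity the abstract attributes to depth-first traversal. The pruned statement trees would then be passed to an encoder and a classifier, neither of which is sketched here.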

Similar resources

Object-Based Classification of UltraCamD Imagery for Identification of Tree Species in the Mixed Planted Forest

This study contributes to assessing high-resolution digital aerial imagery for semi-automatic identification of tree species. To maximize the benefit of such data, object-based classification was conducted in a mixed forest plantation. Two subsets of an UltraCam D image were geometrically corrected using the aero-triangulation method. Some appropriate transformations were perfor...

Forest Stand Types Classification Using Tree-Based Algorithms and SPOT-HRG Data

Forest type mapping is one of the most necessary elements in forest management and silviculture treatments. Traditional methods such as field surveys are time-consuming and cost-intensive. Improvements in remote sensing data sources and classification-estimation methods are creating new opportunities for obtaining more accurate maps of forest biophysical attributes. This research co...

Improving Abstract Syntax Tree based Source Code Change Detection

This document sets the direction for my diploma thesis on the subject of how applying similarity measures might improve abstract syntax tree based source code change detection. It defines the main tasks as well as the envisioned outcome of my work, and serves as a tentative schedule by specifying work packages, i.e., milestones, and associated deadlines. The remainder of this document is organized a...

Machine Learning Based Source Code Classification Using Syntax Oriented Features

As of today, the programming language of the vast majority of published source code is manually specified or programmatically assigned based solely on the file extension. In this paper we show that the source code programming language identification task can be fully automated using machine learning techniques. We first define the criteria that a production-level automatic programming language...

Journal

Journal title: Mathematics

Year: 2023

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math11020461